Implement a new --failing-and-slow-first command line argument to test runner. #24624
…t runner. This keeps track of results of previous test run, and on subsequent runs, failing tests are run first, then skipped tests, and last, successful tests in slowest-first order. Add support for --failfast in the multithreaded test suite. This improves parallelism throughput of the suite, and helps stop at test failures quickly.
IIUC this is what I currently use --failfast --continue for. The downside of --failfast --continue of course is that it doesn't work for parallel testing (so I also add -j1).
Actually maybe I misunderstood. I use --failfast --continue when implementing new features and wanting to fix each test failure as I run into it.
How does this improve CI times on the bots? It seems like it would not affect the first run, but only subsequent runs, which the bots don't do, do they?
It doesn't work on the current CircleCI bots, which always start from a clean slate and run all suites from a single command invocation, but it does help when a developer runs test suites locally, and on the ad hoc CI I am running at http://clbri.com:8010/ . For example, in one such run (screenshot not preserved), all the failing suites fail in a matter of a few seconds, rather than taking a random length of time to fail. Passing suites also run faster, since the shortest tests are run last, meaning that core utilization will be 100% throughout the test suite run. It is like a self-calibrating version to avoid having to name tests
This keeps track of the results of the previous test run, and on subsequent runs, failing tests are run first, then skipped tests, and last, successful tests in slowest-first order. This improves the parallel throughput of the suite.
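The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the PreviousResult record, the outcome strings, and the order_tests helper are all hypothetical names, and it assumes every test has a recorded result from the previous run.

```python
from dataclasses import dataclass


@dataclass
class PreviousResult:
    # Hypothetical record of one test from the previous run.
    name: str
    outcome: str     # assumed values: "failed", "skipped", or "passed"
    duration: float  # wall time in seconds


# Lower rank runs earlier: failures, then skips, then passes.
OUTCOME_RANK = {"failed": 0, "skipped": 1, "passed": 2}


def order_tests(results):
    """Return test names: failed first, then skipped, then passed slowest-first."""
    def key(r):
        # Only passed tests are ordered by duration; negating it sorts descending.
        slow_first = -r.duration if r.outcome == "passed" else 0.0
        return (OUTCOME_RANK[r.outcome], slow_first)

    # sorted() is stable, so failed and skipped tests keep their input order.
    return [r.name for r in sorted(results, key=key)]


results = [
    PreviousResult("test_a", "passed", 1.0),
    PreviousResult("test_b", "failed", 5.0),
    PreviousResult("test_c", "passed", 9.0),
    PreviousResult("test_d", "skipped", 0.0),
]
print(order_tests(results))
```

How tests without a previous result should be ranked is not stated in the description, so the sketch leaves that case out.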
Add support for --failfast in the multithreaded test suite to help stop suite runs at first test failures quickly.
These two flags, --failfast and --failing-and-slow-first, together can help achieve < 10 second test suite runs on a CI when the suite is failing.
Example core0 runtime with test/runner core0 on a 16-core/32-thread system: [screenshot not preserved]
Same suite runtime with test/runner --failing-and-slow-first core0: [screenshot not preserved]
This gains better throughput and a 20.37% reduction in test suite wall time.
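To illustrate why slowest-first ordering can shorten wall time in a parallel run, here is a small simulation. It is only a sketch under stated assumptions: the durations and worker count are made up, and simulated_wall_time models a simple pool in which each idle worker takes the next test from the queue, which is not necessarily how the runner schedules work.

```python
import heapq


def simulated_wall_time(durations, workers):
    """Simulate a pool where each test goes to the worker that frees up first."""
    free_at = [0.0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)      # earliest idle worker picks the test
        heapq.heappush(free_at, start + d)  # it is busy until start + d
    return max(free_at)


# Hypothetical durations in seconds: one slow test and eight fast ones.
durations = [10.0] + [1.0] * 8

slow_last = simulated_wall_time(sorted(durations), workers=2)
slow_first = simulated_wall_time(sorted(durations, reverse=True), workers=2)
print(slow_last, slow_first)
```

When the slow test is queued last, one worker sits idle while the other finishes it; queued first, the fast tests fill the other worker in the meantime.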